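In R, atomic structures such as vectors, matrices, and arrays are homogeneous: every element must have the same data type. Calls such as as.vector(X) or c(X) flatten data into a single vector, but combining values of different types this way often triggers unwanted coercion.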
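As a minimal sketch of what flattening does, assume a small made-up 2 x 3 integer matrix X (the values are illustrative only). Both as.vector() and c() drop the dim attribute and read the elements column by column; the element type is unchanged as long as the data is already homogeneous.

```r
# Hypothetical 2 x 3 matrix; the values are illustrative only.
X <- matrix(1:6, nrow = 2)

flat_a <- as.vector(X)   # drops the dim attribute, reads column by column
flat_b <- c(X)           # c() flattens a matrix in the same way

print(flat_a)                     # 1 2 3 4 5 6
print(dim(flat_a))                # NULL: a plain vector has no dimensions
print(identical(flat_a, flat_b))  # TRUE
print(typeof(flat_a))             # "integer": the type is preserved
```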
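1. The Homogeneity Barrier
When you try to combine numeric data with character labels in a single vector, R coerces every element to the most flexible type present (character, whenever a string is involved). The numbers become text, so arithmetic on them no longer works. A list avoids this by acting as a recursive container in which each component keeps its own type.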
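The contrast can be sketched with three made-up values (a number, a name, and a logical flag, all hypothetical). The same values are combined once with c() and once with list():

```r
# Hypothetical values chosen only for illustration.
vec <- c(42.5, "Alice", TRUE)
print(vec)            # every element is now a character string
print(typeof(vec))    # "character"

lst <- list(42.5, "Alice", TRUE)
print(sapply(lst, typeof))   # each component keeps its own type

# Arithmetic still works on the numeric list component.
print(lst[[1]] * 2)          # 85
```

Trying vec[1] * 2 instead would fail, because "42.5" is text rather than a number.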
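2. Derived Complexity
Advanced data management often means storing metadata alongside the data values. cut() bins a continuous variable into intervals and returns a factor, and factor() encodes a vector as a categorical variable with a fixed set of levels. These objects carry attributes, such as levels and class, that a plain atomic vector does not have.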
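A short sketch of the metadata that cut() attaches, assuming a made-up incomes vector measured in thousands of dollars (the values are hypothetical):

```r
# Hypothetical incomes in thousands of dollars; illustrative values only.
incomes <- c(42, 58, 61, 49, 73, 55)

# Eight break points (35, 45, ..., 105) define seven right-closed intervals.
incomef <- cut(incomes, breaks = 35 + 10 * (0:7))

print(class(incomef))        # "factor": cut() already returns a factor
print(levels(incomef))       # interval labels stored as metadata
print(as.integer(incomef))   # underlying integer codes: 1 3 3 2 4 2
print(attributes(incomef))   # the levels and class attributes
```

Wrapping the result in factor(), as the case study does, drops the intervals that contain no observations.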
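3. Organizing Statistical Output
Statistical summaries such as frequency tables (table()) and cross-tabulations can be multidimensional. A single list can hold the raw vector, the binned intervals, and the final table(incomef, statef) summary together, which keeps the project workspace tidy and structured.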
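To illustrate the shape of these summaries, here is a sketch that assumes six made-up observations; the state codes and income values are hypothetical. A one-way table has one dimension, and a cross-tabulation has one dimension per classifying factor.

```r
# Hypothetical data: state codes and incomes in thousands of dollars.
statef  <- factor(c("nsw", "vic", "nsw", "qld", "vic", "qld"))
incomes <- c(42, 58, 61, 49, 73, 55)
incomef <- factor(cut(incomes, breaks = 35 + 10 * (0:7)))

counts <- table(statef)            # one-way frequency table
cross  <- table(incomef, statef)   # two-way cross-tabulation

print(length(dim(counts)))   # 1
print(dim(cross))            # income brackets in use by states
print(cross)

# Column totals of the cross-tabulation reproduce the one-way counts.
print(margin.table(cross, 2))
```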
QUESTION 1
What happens if you use vec <- c(1, "Data", TRUE)?
It creates a list of three elements.
It coerces all elements to character strings.
It throws an error because types are mixed.
It keeps the types separate within the vector.
✅ Correct!
Atomic vectors must be homogeneous; R coerces all elements to the most flexible type present (here, character) to maintain consistency.
❌ Incorrect
Atomic vectors cannot hold mixed types. Use a list if you want to preserve individual data types.
QUESTION 2
Which function is used to bin a numeric vector into categorical ranges?
factor()
table()
cut()
tapply()
✅ Correct!
The cut() function divides the range of x into intervals and codes the values according to which interval they fall into.
❌ Incorrect
While factor() is often used on the result, cut() is the specific function for interval binning.
QUESTION 3
What is the primary structural difference between a Matrix and a List?
Matrices are faster; Lists are slower.
Matrices must be homogeneous; Lists can be heterogeneous.
Matrices can only have two dimensions; Lists have unlimited.
There is no structural difference.
✅ Correct!
Matrices require all elements to be the same mode (numeric, character, etc.), while lists are 'recursive' and can hold any combination.
❌ Incorrect
The key difference is homogeneity (Matrix) versus heterogeneity (List).
QUESTION 4
How would you create a cross-tabulation of income groups by state?
table(incomef, statef)
c(incomef, statef)
as.vector(incomef, statef)
list(incomef, statef)
✅ Correct!
The table() function performs frequency counts and cross-tabulations for categorical variables.
❌ Incorrect
list() would simply store them separately; table() computes the relationship between them.
QUESTION 5
If X is a matrix, what does as.vector(X) return?
A list of the matrix rows.
A single-column matrix.
A flattened atomic vector of the matrix elements.
The dimensions of the matrix.
✅ Correct!
as.vector() strips the attributes (like dimensions) and returns the data as a simple vector.
❌ Incorrect
It flattens the matrix into a vector, typically in column-major order.
Case Study: Managing Multi-Mode Demographic Research
Data Structuring Challenge
A demographic study collects 'incomes' (numeric) and 'statef' (character). You need to categorize incomes into $10k brackets starting from $35k and then produce a summary that links these brackets to the states, storing all objects in a single container.
Q1. Write the code to create the income factor using 'cut' with breaks starting at 35 and increasing by 10 for 7 intervals.
Solution:
incomef <- factor(cut(incomes, breaks = 35 + 10 * (0:7)))
Q2. How would you store the raw income data, the state frequency table, and the cross-tabulation in one object named 'research_bundle'?
Solution:
research_bundle <- list(raw = incomes, states = table(statef), cross = table(incomef, statef))
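As a usage sketch, the bundle can be built and queried as follows. The incomes and statef values are made up for illustration, since the case study does not supply data; only the object and component names come from the solution above.

```r
# Hypothetical inputs; the case study does not provide actual data.
incomes <- c(42, 58, 61, 49, 73, 55)   # thousands of dollars
statef  <- factor(c("nsw", "vic", "nsw", "qld", "vic", "qld"))
incomef <- factor(cut(incomes, breaks = 35 + 10 * (0:7)))

research_bundle <- list(raw    = incomes,
                        states = table(statef),
                        cross  = table(incomef, statef))

print(names(research_bundle))             # "raw" "states" "cross"
print(mean(research_bundle$raw))          # raw component is still numeric
print(research_bundle[["states"]]["nsw"]) # count for one state
print(class(research_bundle$cross))       # "table"

# Compact overview of what each component holds.
str(research_bundle, max.level = 1)
```

Components are retrieved by name with $ or [[ ]], so each object keeps its own class inside the list.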